46 research outputs found
Particle swarm optimization with state-based adaptive velocity limit strategy
Velocity limit (VL) has been widely adopted in many variants of particle
swarm optimization (PSO) to prevent particles from searching outside the
solution space. Several adaptive VL strategies have been introduced to improve
the performance of PSO. However, existing adaptive VL strategies adjust VL
based solely on the iteration count, leading to unsatisfactory optimization
results because of the mismatch between VL and the current search state of the
particles. To address this problem, a
novel PSO variant with state-based adaptive velocity limit strategy (PSO-SAVL)
is proposed. In the proposed PSO-SAVL, VL is adaptively adjusted based on the
evolutionary state estimation (ESE) in which a high value of VL is set for
global searching state and a low value of VL is set for local searching state.
In addition, limit-handling strategies are modified and adopted to
improve the ability to escape local optima. The good performance of
PSO-SAVL has been experimentally validated on a wide range of benchmark
functions with 50 dimensions. The satisfactory scalability of PSO-SAVL in
high-dimensional and large-scale problems is also verified, as are the merits
of the individual strategies in PSO-SAVL. A sensitivity analysis of the
relevant hyper-parameters in the state-based adaptive VL strategy is
conducted, and insights into how to select these hyper-parameters are also
discussed. Comment: 33 pages, 8 figures
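The core idea can be sketched in a few lines: a canonical PSO velocity update whose clamp bound is chosen from the current search state rather than the iteration count. This is a minimal illustration, not the paper's algorithm; the linear mapping in `adaptive_vl`, the coefficient values, and the scalar `state` standing in for the evolutionary state estimation (ESE) are all assumptions.

```python
import random

def adaptive_vl(state, vl_min=0.2, vl_max=2.0):
    """State-based velocity limit: a high VL for a global (exploratory)
    search state, a low VL for a local (exploitative) one. `state` in
    [0, 1] is a placeholder for the evolutionary state estimate (ESE)."""
    return vl_min + state * (vl_max - vl_min)

def pso_step(pos, vel, pbest, gbest, vl, w=0.7, c1=1.5, c2=1.5):
    """One canonical PSO update with each velocity clamped to [-vl, vl]."""
    for i in range(len(pos)):
        for d in range(len(pos[i])):
            r1, r2 = random.random(), random.random()
            v = (w * vel[i][d]
                 + c1 * r1 * (pbest[i][d] - pos[i][d])
                 + c2 * r2 * (gbest[d] - pos[i][d]))
            vel[i][d] = max(-vl, min(vl, v))  # limit handling: clamp
            pos[i][d] += vel[i][d]
```

Because the limit is recomputed from the state each iteration, particles regain exploration range whenever the ESE signals a global search phase, rather than decaying monotonically with iterations.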
Feature-aware conditional GAN for category text generation
Category text generation has received considerable attention because it is
beneficial for various natural language processing tasks. Recently, the
generative adversarial network (GAN) has attained promising performance in text
generation, attributed to its adversarial training process. However, there are
several issues in text GANs, including discreteness, training instability, mode
collapse, and a lack of diversity and controllability. To address these issues,
this paper proposes a novel GAN framework, the feature-aware conditional GAN
(FA-GAN), for controllable category text generation. In FA-GAN, the generator
has a sequence-to-sequence structure for improving sentence diversity, which
consists of three encoders including a special feature-aware encoder and a
category-aware encoder, and one relational-memory-core-based decoder with the
Gumbel SoftMax activation function. The discriminator has an additional
category classification head. To generate sentences with specified categories,
the multi-class classification loss is supplemented in the adversarial
training. Comprehensive experiments have been conducted, and the results show
that FA-GAN consistently outperforms 10 state-of-the-art text generation
approaches on 6 text classification datasets. The case study demonstrates that
the synthetic sentences generated by FA-GAN can match the required categories
and are aware of the features of conditioned sentences, with good readability,
fluency, and text authenticity. Comment: 27 pages, 8 figures
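The Gumbel-SoftMax activation mentioned above is the standard trick text GANs use to get a differentiable surrogate for sampling a discrete token. A minimal, framework-free sketch of that relaxation (the temperature value and plain-list interface are illustrative, not FA-GAN's implementation):

```python
import math
import random

def gumbel_softmax(logits, tau=1.0):
    """Gumbel-Softmax relaxation: perturb logits with Gumbel noise, then
    apply a temperature-scaled softmax. As tau -> 0 the output approaches a
    one-hot sample; at higher tau it stays smooth and differentiable."""
    gumbels = [-math.log(-math.log(random.random())) for _ in logits]
    scores = [(l + g) / tau for l, g in zip(logits, gumbels)]
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]
```

In a text GAN the decoder emits such a relaxed distribution per step, so gradients from the discriminator can flow back through the "sampled" token.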
Distilling Universal and Joint Knowledge for Cross-Domain Model Compression on Time Series Data
For many real-world time series tasks, the computational complexity of
prevalent deep learning models often hinders deployment in resource-limited
environments (e.g., smartphones). Moreover, due to the inevitable domain shift
between the model training (source) and deployment (target) stages, compressing
those deep models under cross-domain scenarios becomes even more challenging.
Although some existing works have already explored cross-domain knowledge
distillation for model compression, they are either biased to source data or
heavily tangled between source and target data. To this end, we design a novel
end-to-end framework called Universal and joint knowledge distillation (UNI-KD)
for cross-domain model compression. In particular, we propose to transfer both
the universal feature-level knowledge across source and target domains and the
joint logit-level knowledge shared by both domains from the teacher to the
student model via an adversarial learning scheme. More specifically, a
feature-domain discriminator is employed to align teacher's and student's
representations for universal knowledge transfer. A data-domain discriminator
is utilized to prioritize the domain-shared samples for joint knowledge
transfer. Extensive experimental results on four time series datasets
demonstrate the superiority of our proposed method over state-of-the-art (SOTA)
benchmarks. Comment: Accepted by IJCAI 202
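The logit-level transfer described above can be illustrated with a standard temperature-scaled distillation loss, reweighted per sample. This is a simplified sketch, not UNI-KD itself: the scalar `weight` stands in for the output of the data-domain discriminator that prioritizes domain-shared samples, and the temperature value is illustrative.

```python
import math

def _softmax(logits, t):
    m = max(logits)
    exps = [math.exp((v - m) / t) for v in logits]
    s = sum(exps)
    return [e / s for e in exps]

def weighted_kd_loss(teacher_logits, student_logits, weight, t=2.0):
    """Temperature-scaled KL(teacher || student) for logit-level knowledge
    distillation, scaled by a per-sample weight in [0, 1]. The t**2 factor
    is the usual correction for softening both distributions."""
    p = _softmax(teacher_logits, t)
    q = _softmax(student_logits, t)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))
    return weight * (t ** 2) * kl
```

Samples the discriminator judges domain-shared would receive a weight near 1 and dominate the distillation signal; source-biased samples would be downweighted.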
Heuristics-Driven Link-of-Analogy Prompting: Enhancing Large Language Models for Document-Level Event Argument Extraction
In this study, we investigate in-context learning (ICL) in document-level
event argument extraction (EAE). The paper identifies key challenges in this
problem, including example selection, context length limitation, abundance of
event types, and the limitation of Chain-of-Thought (CoT) prompting in
non-reasoning tasks. To address these challenges, we introduce the
Heuristic-Driven Link-of-Analogy (HD-LoA) prompting method. Specifically, we
hypothesize and validate that LLMs learn task-specific heuristics from
demonstrations via ICL. Building upon this hypothesis, we introduce an explicit
heuristic-driven demonstration construction approach, which transforms the
haphazard example selection process into a methodical one that emphasizes
task heuristics. Additionally, inspired by the analogical reasoning of humans,
we propose the link-of-analogy prompting, which enables LLMs to process new
situations by drawing analogies to known situations, enhancing their
adaptability. Extensive experiments show that our method outperforms the
existing prompting methods and few-shot supervised learning methods, exhibiting
F1 score improvements of 4.53% and 9.38% on the document-level EAE dataset.
Furthermore, when applied to sentiment analysis and natural language inference
tasks, the HD-LoA prompting achieves accuracy gains of 2.87% and 2.63%,
indicating its effectiveness across different tasks.
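The prompt construction described above can be sketched as a simple template: explicit task heuristics first, then a demonstration, then the new case to be solved by analogy. The section wording and function name here are illustrative assumptions, not the paper's exact template.

```python
def build_hd_loa_prompt(heuristics, demonstration, new_input):
    """Assemble a heuristic-driven, link-of-analogy style prompt: stated
    heuristics replace haphazardly chosen examples as the main signal, and
    the closing instruction asks the model to reason by analogy."""
    lines = ["Task heuristics:"]
    lines.extend(f"- {h}" for h in heuristics)
    lines.extend(["", "Worked example:", demonstration, "",
                  "By analogy with the example above, solve:", new_input])
    return "\n".join(lines)
```

The point of making heuristics explicit is that the LLM no longer has to infer them from a handful of demonstrations, which is what the paper's hypothesis about ICL suggests it would otherwise attempt.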
Effective Action Recognition with Embedded Key Point Shifts
Temporal feature extraction is an essential technique in video-based action
recognition. Key points have been utilized in skeleton-based action recognition
methods but they require costly key point annotation. In this paper, we propose
a novel temporal feature extraction module, named the Key Point Shifts
Embedding Module, to adaptively extract channel-wise key point shifts across
video frames without key point annotation for temporal feature extraction. Key
points are adaptively extracted as feature points with maximum feature values
at split regions, while key point shifts are the spatial displacements of
corresponding key points. The key point shifts are encoded as the overall
temporal features via linear embedding layers in a multi-set manner. Our
method embeds key point shifts at trivial computational cost, achieving
state-of-the-art performance of 82.05% on Mini-Kinetics and competitive
performance on the UCF101, Something-Something-v1, and HMDB51
datasets. Comment: 35 pages, 10 figures
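The key-point and shift definitions above are simple enough to sketch directly: per channel, the key point is the argmax location of the feature map, and the shift is its displacement between consecutive frames. A minimal illustration on plain nested lists (the real module operates on split regions of CNN feature tensors):

```python
def key_points(feature_maps):
    """Per-channel key point: the (row, col) of the maximum activation.
    `feature_maps` is a list of 2D lists, one H x W map per channel."""
    pts = []
    for fm in feature_maps:
        _, r, c = max((v, r, c) for r, row in enumerate(fm)
                      for c, v in enumerate(row))
        pts.append((r, c))
    return pts

def key_point_shifts(frame_a, frame_b):
    """Channel-wise spatial displacement of key points between two frames;
    these shifts are what gets linearly embedded as temporal features."""
    return [(rb - ra, cb - ca)
            for (ra, ca), (rb, cb) in zip(key_points(frame_a),
                                          key_points(frame_b))]
```

Because the key points are read off the feature maps themselves, no skeleton or key point annotation is needed, which is the module's selling point over skeleton-based methods.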
Improving Relation Extraction with Knowledge-attention
While attention mechanisms have proven effective in many NLP tasks, the
majority of them are purely data-driven. We propose a novel knowledge-attention
encoder which incorporates prior knowledge from external lexical resources into
deep neural networks for the relation extraction task. Furthermore, we present
three effective ways of integrating knowledge-attention with self-attention to
maximize the utilization of both knowledge and data. The proposed relation
extraction system is end-to-end and fully attention-based. Experimental results
show that the proposed knowledge-attention mechanism has complementary
strengths with self-attention, and our integrated models outperform existing
CNN, RNN, and self-attention based models. State-of-the-art performance is
achieved on TACRED, a complex and large-scale relation extraction
dataset. Comment: Paper presented at the 2019 Conference on Empirical Methods
in Natural Language Processing (EMNLP 2019).
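The contrast with data-driven attention can be made concrete: in a knowledge-attention head, the keys are fixed embeddings of relation-indicative words taken from an external lexical resource, rather than projections learned from the training data. A simplified single-head sketch under those assumptions (the encoder in the paper is more elaborate):

```python
import math

def knowledge_attention(query, knowledge_keys, values):
    """Scaled dot-product attention whose keys are fixed prior-knowledge
    embeddings: tokens similar to relation-indicative words get higher
    weights regardless of what the training data contains."""
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))
    d = len(query)
    scores = [dot(query, k) / math.sqrt(d) for k in knowledge_keys]
    m = max(scores)  # stabilized softmax over the knowledge keys
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(values[0])
    return [sum(w * v[j] for w, v in zip(weights, values))
            for j in range(dim)]
```

Integrating this with ordinary self-attention, as the paper's three integration schemes do, lets the model fall back on data-driven patterns where the lexical prior is uninformative.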
Hough Transform Implementation For Event-Based Systems: Concepts and Challenges
The Hough transform (HT) is one of the most well-known techniques in computer vision and has been the basis of many practical image processing algorithms. HT, however, is designed to work with frame-based systems such as conventional digital cameras. Recently, event-based systems such as Dynamic Vision Sensor (DVS) cameras have become popular among researchers. Event-based cameras have a significantly high temporal resolution (1 μs), but each pixel can only detect change, not color. As such, conventional image processing algorithms cannot be readily applied to event-based output streams, and it is necessary to adapt them for event-based cameras. This paper provides a systematic explanation, starting from extending the conventional HT to the 3D HT, its adaptation to event-based systems, and the implementation of the 3D HT using Spiking Neural Networks (SNNs). Using SNNs enables the proposed solution to be easily realized on hardware using an FPGA, without requiring a CPU or additional memory. In addition, we discuss techniques for an optimal SNN-based implementation that uses an efficient number of neurons for the required accuracy and resolution along each dimension, without increasing the overall computational complexity. We hope that this will help to reduce the gap between event-based and frame-based systems.
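The adaptation to event streams can be illustrated with classic 2D line voting: instead of scanning a finished frame, each incoming event votes along its sinusoid in (theta, rho) space the moment it arrives. This is a toy scalar sketch of what the paper's SNN would do in parallel; the accumulator dimensions and quantization are illustrative choices.

```python
import math

def make_accumulator(n_theta=180, n_rho=200):
    """Quantized (theta, rho) vote accumulator for line detection."""
    return [[0] * n_rho for _ in range(n_theta)]

def event_hough_vote(acc, x, y, rho_max=100.0):
    """Per-event line Hough voting: event (x, y) immediately increments
    every bin on its curve rho = x*cos(theta) + y*sin(theta), so line
    evidence accumulates incrementally rather than per frame."""
    n_theta, n_rho = len(acc), len(acc[0])
    for ti in range(n_theta):
        theta = ti * math.pi / n_theta
        rho = x * math.cos(theta) + y * math.sin(theta)
        ri = int((rho + rho_max) * n_rho / (2 * rho_max))
        if 0 <= ri < n_rho:
            acc[ti][ri] += 1
```

In an SNN realization, each accumulator bin would map to a neuron that spikes once its vote count crosses a threshold, which is what makes a memory-free FPGA implementation plausible.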
Task-generic semantic convolutional neural network for web text-aided image classification
In this work, we explore how to use external and auxiliary web text to improve image classification. The keystone of web text-aided image classification is representation learning for these two modalities of data. Over the past decade, convolutional neural networks (CNNs), as the core representation method for images, have become a commodity in the computer vision community. On the other hand, word vectors have had a similarly wide-ranging impact on representation learning in NLP. Based on pre-trained word vectors, we propose a novel semantic CNN (s-CNN) model for high-level text representation learning using task-generic semantic filters. However, the s-CNN model inevitably brings about surplus semantic filters in order to achieve better applicability and generalization across universal tasks, and these surplus filters may lead to semantic overlaps and a feature redundancy issue. To address this issue, we develop the s-CNN Clustered (s-CNNC) models, which use filter clusters instead of individual filters. Interacting with the image CNN models, the s-CNNC models can further boost image classification under a multi-modal framework (mm-CNN). In addition, we propose to use the external text information selectively in the mm-CNN network to alleviate the noise problem inherent in web text. We validate the effectiveness of the proposed models on six benchmark datasets, and the results show that our approaches achieve remarkable improvements.
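The two ingredients above, semantic filters built from word vectors and filter clustering to remove redundancy, can be sketched minimally. This is an illustrative reduction, assuming width-1 filters and a precomputed cluster assignment (e.g. from k-means); the actual s-CNN/s-CNNC models are full convolutional architectures.

```python
def semantic_conv(word_vectors, semantic_filters):
    """'Semantic' 1D convolution: each filter is itself a pre-trained word
    vector, so the response at each position is a dot-product similarity
    between the filter word and the word at that position."""
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))
    return [[dot(w, f) for w in word_vectors] for f in semantic_filters]

def cluster_filters(filters, assignment, k):
    """s-CNNC idea: merge semantically overlapping filters by averaging the
    members of each cluster, yielding k representative filters."""
    dims = len(filters[0])
    sums = [[0.0] * dims for _ in range(k)]
    counts = [0] * k
    for f, c in zip(filters, assignment):
        counts[c] += 1
        for j in range(dims):
            sums[c][j] += f[j]
    return [[s / counts[c] for s in sums[c]] for c in range(k)]
```

Replacing near-duplicate filters with their cluster centroid is what cuts the semantic overlap and feature redundancy that surplus task-generic filters would otherwise introduce.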